# Vulnerability Scanning

## Disclaimer
Sample code, software libraries, command line tools, proofs of concept, templates, or other related technology are provided as AWS Content or Third-Party Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content or Third-Party Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content or Third-Party Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content or Third-Party Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

For further guidance on securing your application, refer to the [Security Pillar of the AWS Well-Architected Framework](https://docs.aws.amazon.com/wellarchitected/latest/security-pillar/welcome.html).

## Automated Scans
To help secure the project, several automated tools are run against it. Details on each are given below.

The results below are valid as of November 24, 2023.
### JavaScript/TypeScript
#### Semgrep
[Semgrep](https://github.com/semgrep/semgrep) scans code and package dependencies for known issues, software vulnerabilities, and detected secrets.

No blocking findings were reported.

#### npm-audit
The audit command for npm (`npm audit`) reports known vulnerabilities in a project's dependencies.

No findings.
### Python
#### Bandit
[Bandit](https://bandit.readthedocs.io/en/latest/) is a tool designed to find common security issues in Python code. Bandit has been run against the Python files written for this project. 

One high-severity finding remains outstanding. It has not been remediated because the affected code runs only in CodeBuild as part of the deployment process, so it is not exposed to malicious injection.

```
Issue: [B602:subprocess_popen_with_shell_equals_true] subprocess call with shell=True identified, security issue.
Location: /lib/sagemaker-model/hf-custom-script-model/build-script/script.py:63
```
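For readers who reuse this pattern outside the build environment, the sketch below shows one common way to avoid the B602 finding: pass the command as an argument list so that no shell parses it. This is not the project's `script.py`; the function name `run_command` and the sample input are hypothetical and only illustrate the technique.

```python
import shlex
import subprocess
import sys


def run_command(args: list[str]) -> str:
    """Run a command without a shell and return its standard output."""
    # Passing a list and leaving shell=False (the default) means no shell
    # interprets the arguments, so characters such as ';' are treated as data.
    completed = subprocess.run(args, check=True, capture_output=True, text=True)
    return completed.stdout


if __name__ == "__main__":
    # Hypothetical untrusted value: it reaches the child process as one
    # literal argument instead of being split into a second command.
    untrusted = "hello; echo injected"
    echo_arg = "import sys; print(sys.argv[1])"
    print(run_command([sys.executable, "-c", echo_arg, untrusted]).strip())

    # shlex.split converts an existing command string into an argument list.
    print(shlex.split("pip install -r requirements.txt"))
```

If a command genuinely needs shell features such as pipes, `shell=True` may still be required, in which case any interpolated values should be quoted with `shlex.quote`.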

### CDK Templates
#### cdk_nag
[cdk_nag](https://github.com/cdklabs/cdk-nag/) checks CDK applications for best practices using a combination of available rule packs.

A number of rules have been suppressed. Each suppression is accompanied by a reason.
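To review the suppressions in one place, it can help to read them back from a synthesized CloudFormation template. The sketch below assumes that cdk_nag records resource-level suppressions under each resource's `Metadata.cdk_nag.rules_to_suppress` key; verify that layout against your own synthesized output before relying on it. The function name `list_suppressions` and the sample template are hypothetical.

```python
def list_suppressions(template: dict) -> list[tuple[str, str, str]]:
    """Return (logical_id, rule_id, reason) for each suppression found."""
    found = []
    for logical_id, resource in template.get("Resources", {}).items():
        # Assumed layout: Metadata.cdk_nag.rules_to_suppress is a list of
        # objects with "id" and "reason" keys.
        nag = resource.get("Metadata", {}).get("cdk_nag", {})
        for rule in nag.get("rules_to_suppress", []):
            found.append((logical_id, rule.get("id", ""), rule.get("reason", "")))
    return found


if __name__ == "__main__":
    # Hypothetical, hand-written template fragment for illustration only.
    sample = {
        "Resources": {
            "AuroraDatabase": {
                "Type": "AWS::RDS::DBCluster",
                "Metadata": {
                    "cdk_nag": {
                        "rules_to_suppress": [
                            {
                                "id": "AwsSolutions-RDS10",
                                "reason": "Deletion protection disabled to allow deletion as part of the CloudFormation stack.",
                            }
                        ]
                    }
                },
            },
            "SomeBucket": {"Type": "AWS::S3::Bucket"},
        }
    }
    for logical_id, rule_id, reason in list_suppressions(sample):
        print(f"{logical_id}: {rule_id} ({reason})")
```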

### Git
#### git-secrets
[git-secrets](https://github.com/awslabs/git-secrets) scans commits, commit messages and `--no-ff` merges to prevent adding secrets into your git repositories. 

No findings.

## A Note on Encryption at Rest for Amazon Aurora
If you are using Amazon Aurora with pgvector as a RAG source, encryption at rest is **not** enabled by default. Encryption at rest adds a layer of data protection by securing your data from unauthorized access to the underlying storage.

If you require encryption at rest, note that you cannot convert an unencrypted DB cluster to an encrypted one. To enable encryption, apply the patch below before deploying the application for the first time.

1. Copy the patch below.
```diff
diff --git a/lib/rag-engines/aurora-pgvector/index.ts b/lib/rag-engines/aurora-pgvector/index.ts
index b645cd7..30932cf 100644
--- a/lib/rag-engines/aurora-pgvector/index.ts
+++ b/lib/rag-engines/aurora-pgvector/index.ts
@@ -11,6 +11,7 @@ import * as logs from "aws-cdk-lib/aws-logs";
 import * as rds from "aws-cdk-lib/aws-rds";
 import * as cr from "aws-cdk-lib/custom-resources";
 import * as sfn from "aws-cdk-lib/aws-stepfunctions";
+import * as kms from "aws-cdk-lib/aws-kms";
 import { NagSuppressions } from "cdk-nag";
 
 export interface AuroraPgVectorProps {
@@ -26,6 +27,9 @@ export class AuroraPgVector extends Construct {
   constructor(scope: Construct, id: string, props: AuroraPgVectorProps) {
     super(scope, id);
 
+    const storageKey = new kms.Key(this, "StorageKey", {
+      enableKeyRotation: true,
+    });
 
     const dbCluster = new rds.DatabaseCluster(this, "AuroraDatabase", {
       engine: rds.DatabaseClusterEngine.auroraPostgres({
@@ -35,7 +39,9 @@ export class AuroraPgVector extends Construct {
       writer: rds.ClusterInstance.serverlessV2("ServerlessInstance"),
       vpc: props.shared.vpc,
       vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_ISOLATED },
-      iamAuthentication: true
+      iamAuthentication: true,
+      storageEncrypted: true,
+      storageEncryptionKey: storageKey
     });
 
     const databaseSetupFunction = new lambda.Function(
@@ -103,8 +109,7 @@ export class AuroraPgVector extends Construct {
      */
     NagSuppressions.addResourceSuppressions(dbCluster,
       [
-        {id: "AwsSolutions-RDS10", reason: "Deletion protection disabled to allow deletion as part of the CloudFormation stack."},
-        {id: "AwsSolutions-RDS2", reason: "Encryption cannot be enabled on an unencrypted DB Cluster, therefore enabling will destroy existing data. Docs provide instructions for users requiring it."}
+        {id: "AwsSolutions-RDS10", reason: "Deletion protection disabled to allow deletion as part of the CloudFormation stack."}
       ]
     );
   }
```
2. In the root of the repository, create a patch file containing the copied content.
```console
vim aurora_enable_encryption.patch
```
3. Apply the patch.
```console
git apply aurora_enable_encryption.patch
```
4. Deploy the application as usual. Either commit the change to your own fork of the repository, or take care not to overwrite it later.
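Because encryption cannot be added to an existing cluster, it may be worth confirming that the patch took effect before the first deployment. The sketch below checks a synthesized CloudFormation template for `AWS::RDS::DBCluster` resources whose `StorageEncrypted` property is not `true`. The function name `unencrypted_clusters` and the sample templates are hypothetical; in practice the template would typically be loaded from the JSON that `cdk synth` writes to `cdk.out`.

```python
def unencrypted_clusters(template: dict) -> list[str]:
    """Return logical IDs of DB clusters that do not set StorageEncrypted to true."""
    missing = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::RDS::DBCluster":
            continue
        # A missing property counts as unencrypted for this check.
        if resource.get("Properties", {}).get("StorageEncrypted") is not True:
            missing.append(logical_id)
    return missing


if __name__ == "__main__":
    # Hypothetical template fragments: one before and one after the patch.
    before = {
        "Resources": {
            "AuroraDatabase": {"Type": "AWS::RDS::DBCluster", "Properties": {}}
        }
    }
    after = {
        "Resources": {
            "AuroraDatabase": {
                "Type": "AWS::RDS::DBCluster",
                "Properties": {"StorageEncrypted": True},
            }
        }
    }
    print(unencrypted_clusters(before))
    print(unencrypted_clusters(after))
```

An empty list for your synthesized template suggests every cluster in that template has encryption at rest enabled. Clusters defined in nested stacks are written to separate template files and need to be checked separately.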